Cross-lingual information retrieval systems

نویسنده

  • Shadi Saleh
چکیده

In this work, we will explore different approaches used in Cross-Lingual Information Retrieval (CLIR) systems. Mainly, CLIR systems which use statistical machine translation (SMT) systems to translate queries into collection language. This will include using SMT systems as a black box or as a white box, also the SMT systems that are tuned towards better CLIR performance. After that, we will present our approach to rerank the alternative translations using machine learning regression model. This includes also introducing our set of features which we used to train the model. After that, we adapt this reranker for new languages. We also present our query expansion approach using word-embeddings model that is trained on medical data. Finally we reinvestigate translating the document collection into query language, then we present our future work.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Query Translation for Cross-lingual Information Retrieval using Wikipedia

In this paper the system WikiTranslate is introduced that performs query translation for cross-lingual information retrieval (CLIR) that only uses Wikipedia. Queries will be mapped to Wikipedia concepts and the corresponding translations of these concepts in the target language are used to create the final query. WikiTranslate is evaluated by searching with topics in Dutch, French and Spanish i...

متن کامل

Modern Multilingual and Cross-lingual Information Access Technologies

In this chapter, we describe the state of the art cross-lingual and multilingual strategies and their related areas. In particular, we show a WWW-based information system called MIETTA, which allows uniform and multilingual access to heterogeneous data sources in the tourism domain. The design of the search engine is based on a new cross-lingual framework. The framework integrates a cross-lingu...

متن کامل

Using Information Extraction to Improve Cross-lingual Document Retrieval

We present a filtering mechanism using two cross-lingual information extraction (CLIE) systems for improving document relevance of cross-lingual information retrieval (CLIR) for queries conforming to predefined templates. Experiments on retrieving Chinese documents in response to English GALE arrest queries show that this approach can obtain a 12.7% absolute improvement in relevance (representi...

متن کامل

Leveraging User Interaction and Social Tagging for Improving Cross-lingual Information Access in Digital Libraries

Evaluation of interactive cross-lingual information retrieval systems has been the focus of recent research. The goal is to support the users in formulating effective queries and selecting the documents which satisfy their information needs regardless of the language of the documents. This dissertation aims at harnessing the user-system interaction, extracting the added value and integrating it...

متن کامل

Ricoh at CLEF 2004

Abstract. This paper describes the participation of RICOH in the monolingual and cross-lingual information retrieval tasks on German Indexing and Retrieval Testdatabase (GIRT) in the Cross-Language Evaluation Forum (CLEF) 2004. We used a morphological analyzer for word decompounding and parallel corpora for cross-lingual information retrieval. The performance of cross-lingual information retrie...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017